Digitized Voice Programmer's Toolkit for the PC
-----------------------------------------------
Version 3.0
Copyright (c) 1988-1993 Farpoint Software
Toolkit Instruction Manual
-----------------------------------------------------------------------------
Table of Contents
-----------------
I. Introduction
II. Two Different Playback Methods
    A. Internal PC Speaker
    B. Printer Port Device
III. How Recordings Are Made
IV. Shareware Notice
V. Feature List for DVPT Version 3.0
    A. Features Common to both Standard and Professional Versions
    B. Features Found Only in the Professional Version
VI. List of Files Included in the Toolkit
    A. Standard Version
    B. Additional Files in the Professional Version
VII. The Voice Data File Editor (VDFE)
VIII. The Playback Module (FSSPEAK.EXE)
IX. How To Use FSSPEAK.EXE as a Transient Program
X. How To Use FSSPEAK.EXE as a TSR (Memory-resident) Program
XI. The Example Programs
XII. Other Programs Included in the Toolkit
XIII. How To Reduce Disk Space Requirements
XIV. The EAR-WIZARD Playback Device
XV. Playback Through the PC Speaker
XVI. Speed Requirements for the Target Machine
XVII. General Information About the Digitizer (Recorder)
XVIII. The Signal Level Indicators on the Digitizer
XIX. Using the EAR-WIZARD as a Security Device
XX. Pricing
---------- DVPT version 3.0 Instruction Manual ---------- page 2 ----------
I. Introduction
------------
This toolkit is a combination of software and hardware designed for the
purpose of mechanizing and simplifying the process by which programmers may
create digitized voice recordings, store them on disk, edit the voice data
files, and incorporate digitized voice playback into their own high-level
language programs.
This is the third major release of the Digitized Voice Programmer's Toolkit.
The most obvious changes compared to version 2 include a complete change in
the programming interface to avoid the necessity of linking our code into
your program, the availability of a NEW INEXPENSIVE SOUND OUTPUT DEVICE
(the EAR-WIZARD) that plugs into the printer port, and the ability to
directly play compressed sound files. In addition, the playback code will
now function in a DOS box under enhanced-mode Microsoft Windows.
II. Two Different Playback Methods
------------------------------
A. Internal PC Speaker
This method is of course the least expensive option. It has the
advantage of working on the majority of PCs of the 286 class or faster,
and requires NO SPECIAL HARDWARE at the end user's site; you ship only
software. The disadvantage of this method is that the quality of
speakers in PCs varies considerably. In many cases the sound will be
distorted or low-volume.
B. Printer Port Device
This is an inexpensive device from Farpoint known as the EAR-WIZARD.
This is a small (0.65" by 1.90" by 2.17") block that plugs directly
into any parallel printer (LPT) port. It has a 3.5mm stereo-type phono
jack which can drive either a set of headphones or an external speaker.
An amplified speaker gives the best sound; however, the EAR-WIZARD is
capable of driving non-amplified speakers to a reasonable volume level.
A volume control is provided with enough range to accommodate all of the
above output devices. The advantage to using the EAR-WIZARD is an
improvement in both sound quality and consistency of results. Even
laptop PCs, which normally have very poor internal speakers, will
produce the same result with the EAR-WIZARD as any desktop PC.
III. How Recordings Are Made
-----------------------
The recording of digitized voice requires an external hardware device,
available from Farpoint Software. Note to users of DVPT version 2:
Your board will work; no changes have been made to the recording hardware.
IV. Shareware Notice
----------------
The Digitized Voice Programmer's Toolkit is released as Shareware. This is
copyrighted material; it is NOT "free software". You are permitted to
experiment with this package long enough to determine if it suits your needs,
up to a limit of 30 days, but if you will be making use of the material in
your own programs, then a license fee is required. NO PROGRAM WHICH MAKES
USE OF THE MATERIALS IN THIS TOOLKIT MAY BE SOLD COMMERCIALLY, ON A CONTRACT
BASIS, OR AS SHAREWARE UNLESS THE SELLER HAS PAID THE LICENSE FEE. Details
of pricing for licensing and various hardware components can be found at the
end of this file.
For your convenience and our records, a registration/order form is included
in the file ORDER.FRM.
You are granted permission to distribute copies of the SHAREWARE release of
the Digitized Voice Programmer's Toolkit, provided that (1) no fee is charged
for such copies, other than a nominal disk duplication fee, (2) these files
are distributed in their original, unmodified form, and (3) ALL the files in
the original archive are included with each copy. (See "List of Files" below.)
You may NOT distribute any files from a REGISTERED version of this software
package except as specified in the licensing agreement.
If you paid a "disk duplication fee" or other such fee to a distributor of
public domain and shareware programs, be aware that the payment of this fee
does not constitute registration of this Toolkit. Likewise, the payment of a
fee to any Bulletin Board Service for the time required to download this
Toolkit does not constitute registration. Registration occurs only through
direct interaction with Farpoint Software.
If more information is needed, write or contact Alan D. Jones through
Compuserve Information Service at user ID [74030,554], or call Farpoint
Software at 713-332-3782 (voice) or 713-332-4730 (fax).
V. Feature List for DVPT version 3.0
---------------------------------
A. Features Common to both Standard and Professional Versions
1. Operates in the DOS environment.
2. Provides a redistributable (with license) executable module that
performs all playback functions. This is an EXE file that is intended
to be shipped to the end user with your application. Unlike previous
versions of DVPT, it is no longer necessary to compile and link our
code into your program. You can write your application in any language
that has the capability of executing external programs.
3. The playback module is capable of being used either as a TSR driver
(removable from memory on demand), or as a conventional transient
program.
4. There are no length limitations on either the size of the memory
buffers or the size of the voice data files on disk other than the
physical limits of the machine itself. 64k is not a special number.
5. Playback is accomplished either through the PC speaker or through
Farpoint's EAR-WIZARD device plugged into a printer port.
6. A sophisticated voice data file editor is provided. It provides a file
indexing feature that can be used to directly create "sound dictionary"
files of a format that is relatively easy for the programmer to
manipulate.
7. Short example programs are included, written in C, which demonstrate
the use of the playback module.
B. Features Found Only in the Professional Version
1. Sound data files can be compressed 2:1 using mu-Law compression
or 4:1 using ADPCM compression. These compressed files can be
directly played without decompressing them on disk, thus reducing
disk space requirements. This also allows a much longer continuous
sound block to be played within a given memory buffer size.
2. The playback module will play files at any sample rate from 7500 Hz
to 32767 Hz. In comparison, the Standard version operates only at
16572 Hz (compatible with version 2 files). A utility is provided
that can alter the inherent sample rate of a sound file.
VI. List of Files Included in the Toolkit
-------------------------------------
A. Standard Version
VOICEKIT.DOC -- this file
READ.ME -- instructions for printing VOICEKIT.DOC
ORDER.FRM -- a printable order form
DEMO.VOI -- just a sample of recorded voice
RUN_ME.BAT -- plays the demo
FSSPEAK.EXE -- the redistributable playback module
VDFE.EXE -- the voice data file editor
VDFE.HLP -- help file for the voice data file editor
EXAMPLE1.C -- shows how to use FSSPEAK as a transient program
EXAMPLE1.EXE -- compiled executable of EXAMPLE1.C
EXAMPLE2.C -- shows how to use FSSPEAK as a TSR
EXAMPLE2.EXE -- compiled executable of EXAMPLE2.C
ULAW.EXE -- compresses 8-bit PCM files to 4-bit uLaw
UNULAW.EXE -- decompresses 4-bit uLaw files to 8-bit PCM
ADPCM.EXE -- compresses 8-bit PCM files to 2-bit ADPCM
UNADPCM.EXE -- decompresses 2-bit ADPCM files to 8-bit PCM
ADJUST.EXE -- tool for adjustment of the voice recorder
If you received the Toolkit with any of the above files missing, please
notify Farpoint Software.
B. Additional Files in the Professional Version
FSSPEAK.EXE -- playback module with additional features
RESAMPLE.EXE -- changes the actual sample rate of a PCM file
MAKEWAVE.EXE -- creates WAV-format files from Farpoint PCM files
LOWPASS.EXE -- digital filter used by RESAMPLE & MAKEWAVE
VII. The Voice Data File Editor (VDFE)
---------------------------------
This program provides a convenient environment for creating, editing, and
generally patching together voice data files, and creating index files for
them. VDFE requires no command line parameters. Upon execution, it displays
its primary screen and waits for user input. This consists of an assortment
of single keystroke commands, accessible directly or through a keyboard
operated pulldown menu system. The editor and its operation are described
in detail in the file VDFE.HLP; this file can be displayed by pressing the
<F1> key while executing VDFE.
NOTE: The VDFE editor operates only with uncompressed data files, at a fixed
sample rate of 16572 Hz. Even if you plan to ship compressed sound data files,
always keep your original uncompressed files for editing or other
reprocessing. Uncompressed files can be recovered from compressed files,
but the original sound quality will not be restored.
VIII. The Playback Module (FSSPEAK.EXE)
---------------------------------
Almost any high-level language or application-generation system supports
some method by which other (external) programs can be executed. FSSPEAK.EXE
is used in this manner. It accepts an assortment of command-line parameters
to invoke its various functions and modes of operation. Command line
parameters must appear in the order shown and must be separated by one or
more blanks. A parameter enclosed in square brackets is optional; the square
brackets do not actually appear in the command line. Angle brackets are shown
here as delimiters for individual named parameters, but do not actually
appear in the command line.
FSSPEAK will operate in a DOS box under enhanced-mode Microsoft Windows if
the following condition is met: The user's SYSTEM.INI file must be modified
to include the line "TrapTimerPorts=False" in the "[386Enh]" section. Note
that the END USER must do this, so your software package needs to explain
this to the user if you intend your program to be capable of operating
under Microsoft Windows. Be aware that, in this environment, the playback
cannot be stopped with a keystroke as it can under DOS.
IX. How to Use FSSPEAK.EXE as a Transient Program
---------------------------------------------
This is the simplest way to use FSSPEAK, although it produces more disk
activity and therefore slightly slower responsiveness.
There are two types of invocations required. First, as an initialization
procedure, a "calibration file" must be created by FSSPEAK. This requires
the application to either specify the output device or use the auto-locate
function to find an EAR-WIZARD if one exists. If specified by the
application, the choices are: (a) the internal speaker, (b) an EAR-WIZARD
on LPT1, (c) an EAR-WIZARD on LPT2, or (d) an EAR-WIZARD on LPT3. If the
user changes the desired output device, the calibration must be run again.
The full command line looks like this:
FSSPEAK.EXE /Sdc <sps> <cal file>
* "d" is a number from 0 to 3 where 0 = internal speaker, 1 = LPT1,
2 = LPT2, and 3 = LPT3. Alternatively, placing an 'A' in this
position causes the program to automatically locate an EAR-WIZARD
on any LPT port if one exists. If no EAR-WIZARD is found, then the
internal PC speaker is used. Placing a 'B' in this position also
invokes auto-locate, but returns an error if no EAR-WIZARD is found.
+ "c" is the compression algorithm where 0 = no compression, 1 = uLaw,
and 2 = ADPCM.
+ "sps" is the sample rate (in samples per second) of the stored data.
* "cal file" is the name of the file which is to be created (or
overwritten). The file will contain the calibration results.
Parameters in the above list marked with "+" have no effect in the Standard
Version. They must still be present as place holders in the command line.
The compression algorithm will always be "no compression" and the sample
rate will always be forced to 16572 Hz.
Once a calibration file has been created, unlimited playback invocations
can be executed without re-calibration.
The second type of invocation is the playback itself. The command line
looks like this:
FSSPEAK.EXE /P <cal file> <data file> [<index file> <index>]
* "cal file" is the name of the calibration file created earlier with
the /S option.
* "data file" is the name of the file containing voice data.
* "index file" names the block index file created with the "VDFE"
editor.
* "index" is a decimal number indicating which block to play.
The last two parameters are included only if you are using an index file.
Without an index file, the entire data file is played.
IMPORTANT: There must be enough free memory to load both FSSPEAK and the
sound data when you do this. FSSPEAK will occupy about 40k. The memory
required by the sound data is equal to the file size (or the size of the
indexed block).
Example: You want to play the 3rd block, then the 5th block, from a sound
data file called SOUND.VOI with its corresponding index file SOUND.NDX.
The files are sampled at 11025 Hz and uLaw compressed. The output is to be
through an EAR-WIZARD on port LPT2:
Calibrate:
FSSPEAK.EXE /S21 11025 calib.fil
where:
"2" selects LPT2
"1" selects uLaw compression
"11025" is the sample rate
"calib.fil" is the name for the calibration file
Play block 3:
FSSPEAK.EXE /P calib.fil sound.voi sound.ndx 3
Play block 5:
FSSPEAK.EXE /P calib.fil sound.voi sound.ndx 5
Example: You want to play an uncompressed file without indexing, recorded
with VDFE at 16572 Hz, through an automatically located EAR-WIZARD or the
PC's internal speaker if there is no EAR-WIZARD. Again assume the file to
be named SOUND.VOI:
Calibrate:
FSSPEAK.EXE /SA0 16572 calib.fil
Play the file:
FSSPEAK.EXE /P calib.fil sound.voi
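The two invocation types above can be assembled in C before being handed
to whatever "exec" facility your compiler provides. The following is a
minimal sketch and is not code from the Toolkit: the function names are
hypothetical, and snprintf() assumes a C99 library (older DOS compilers may
offer only sprintf).

```c
#include <stdio.h>

/* Build "FSSPEAK.EXE /Sdc <sps> <cal file>".
 * dest is the "d" position: '0'..'3', 'A' or 'B'.
 * compression is the "c" position: 0 = none, 1 = uLaw, 2 = ADPCM.
 * Returns 0 on success, -1 on a bad argument or a buffer that is too small.
 * (Hypothetical helper, not part of the Toolkit.) */
int fsspeak_calibrate_cmd(char *buf, size_t size, char dest, int compression,
                          unsigned sps, const char *cal_file)
{
    int n;
    if (!((dest >= '0' && dest <= '3') || dest == 'A' || dest == 'B'))
        return -1;
    if (compression < 0 || compression > 2)
        return -1;
    n = snprintf(buf, size, "FSSPEAK.EXE /S%c%d %u %s",
                 dest, compression, sps, cal_file);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Build "FSSPEAK.EXE /P <cal file> <data file> [<index file> <index>]".
 * Pass index_file == NULL to play the entire data file.
 * (Hypothetical helper, not part of the Toolkit.) */
int fsspeak_play_cmd(char *buf, size_t size, const char *cal_file,
                     const char *data_file, const char *index_file,
                     unsigned index)
{
    int n;
    if (index_file != NULL)
        n = snprintf(buf, size, "FSSPEAK.EXE /P %s %s %s %u",
                     cal_file, data_file, index_file, index);
    else
        n = snprintf(buf, size, "FSSPEAK.EXE /P %s %s", cal_file, data_file);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Calling fsspeak_calibrate_cmd(buf, sizeof buf, '2', 1, 11025, "calib.fil")
reproduces the calibration line of the first example. How the finished
string is executed, and how the exit code is read back, depends on your
language and compiler.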
FSSPEAK.EXE sets the DOS "exit code" based on the result of the operation.
In many languages, the "exec" call or its equivalent will return this value
to your program. In a batch file, the value appears as the value of the
variable "ERRORLEVEL". These are the possible exit codes and their meanings:
1 - Completed successfully
2 - No parameters found on command line
3 - Unable to interpret command line
6 - Sound output destination is invalid (non-existent LPT port?)
7 - Invalid compression algorithm
8 - Sample rate out of range. Range is 7500 to 32767.
9 - Error writing calibration file
10 - Error reading calibration file
11 - Error reading data file
12 - Error reading index file
13 - Specified block number does not exist in index file
14 - This CPU is too slow to function correctly
15 - Attempted to run under Windows without "TrapTimerPorts=False"
16 - Insufficient memory available to load sound data
19 - DOS version number too low (must be 3.1 or higher)
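An application that runs FSSPEAK as a transient program may want to turn
these exit codes into messages for the user. The following is a minimal
sketch of a lookup table built from the list above; the structure and
function names are hypothetical, and the "Unknown exit code" fallback is an
assumption for values the list does not cover.

```c
#include <stddef.h>

/* One row per exit code documented for the /S and /P invocations. */
struct fsspeak_exit {
    int code;
    const char *text;
};

const struct fsspeak_exit fsspeak_exit_table[] = {
    {  1, "Completed successfully" },
    {  2, "No parameters found on command line" },
    {  3, "Unable to interpret command line" },
    {  6, "Sound output destination is invalid" },
    {  7, "Invalid compression algorithm" },
    {  8, "Sample rate out of range (7500 to 32767)" },
    {  9, "Error writing calibration file" },
    { 10, "Error reading calibration file" },
    { 11, "Error reading data file" },
    { 12, "Error reading index file" },
    { 13, "Specified block number does not exist in index file" },
    { 14, "This CPU is too slow to function correctly" },
    { 15, "Running under Windows without TrapTimerPorts=False" },
    { 16, "Insufficient memory available to load sound data" },
    { 19, "DOS version number too low (must be 3.1 or higher)" }
};

/* Return the message for an exit code, or a fallback string for any
 * value not in the table (the fallback text is an assumption). */
const char *fsspeak_exit_text(int code)
{
    size_t i;
    size_t count = sizeof fsspeak_exit_table / sizeof fsspeak_exit_table[0];
    for (i = 0; i < count; i++)
        if (fsspeak_exit_table[i].code == code)
            return fsspeak_exit_table[i].text;
    return "Unknown exit code";
}
```

Note that success for these invocations is reported as 1, so a test of
the form "exit code != 0 means failure" would be wrong here.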
X. How to Use FSSPEAK.EXE as a TSR (Memory-resident) Program
---------------------------------------------------------
Using FSSPEAK.EXE as a resident driver results in lower memory consumption
(resident code occupies 11k) and slightly faster response times since the
EXE file does not need to be loaded for each playback event.
FSSPEAK.EXE can be either loaded before your application (in a batch file)
or by the application itself. To load it into memory, the command line looks
like this:
FSSPEAK.EXE /L [nnnn:nnnn]
* "nnnn:nnnn" (optional) is the hex representation of a far pointer
to a 4-byte buffer where the entry point of the TSR will be written.
(Remember not to put the square brackets in the actual command line.)
The far pointer spec cannot be used easily from batch files.
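When the application itself loads the TSR, it has to turn the address of
its 4-byte buffer into the "nnnn:nnnn" text form. The following is a
minimal sketch under stated assumptions: the function name is hypothetical,
the segment is assumed to come first, and the digits are assumed to be
four zero-padded uppercase hex characters on each side of the colon (this
manual does not spell out those details). snprintf() assumes a C99
library.

```c
#include <stdio.h>

/* Build "FSSPEAK.EXE /L ssss:oooo" from a 16-bit segment and offset.
 * Returns 0 on success, -1 if a value does not fit in 16 bits or the
 * buffer is too small. (Hypothetical helper; digit format is assumed.) */
int fsspeak_load_cmd(char *buf, size_t size, unsigned segment, unsigned offset)
{
    int n;
    if (segment > 0xFFFFu || offset > 0xFFFFu)
        return -1;
    n = snprintf(buf, size, "FSSPEAK.EXE /L %04X:%04X", segment, offset);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

On a 16-bit DOS compiler the segment and offset of the buffer would
typically be obtained with that compiler's far-pointer macros (for example
FP_SEG and FP_OFF where the compiler's dos.h provides them).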
After all operations are finished, FSSPEAK.EXE can be removed from memory
by executing it as follows:
FSSPEAK.EXE /U
The DOS "exit code" from these two operations can have the following values:
0 - Successfully loaded into memory
2 - No parameters found on command line
3 - Unable to interpret command line
4 - Attempt to load twice; already in memory
5 - Attempt to remove from memory but program is not resident
17 - Cannot remove from memory because a later program hooked interrupt 2Fh
18 - Cannot remove from memory; unable to deallocate resident memory block
19 - DOS version number too low (must be 3.1 or higher)
As a TSR, the program accepts commands through int 2Fh with a multiplex
number of 0EEh. At entry to every call, DS:BX must point to the ASCII
string "FSSPEAK30@@@@@@@" and the multiplex number must be in AH. The
individual functions (passed in AL) are:
00h: check if program is in memory
AL returns as 0FFh if program is present
01h: remove program from memory
02h: get entry point for far calls
Entry point address is returned in DX:AX.
You don't need to call any of the above int 2Fh functions if you loaded
FSSPEAK.EXE with the far pointer option. If it was loaded without this
option, or the pointer to the TSR entry point has somehow been lost, then
the following assembly code can be used to obtain it:
.data
; this is DGROUP
signature db 'FSSPEAK30@@@@@@@'
.code
mov ax,DGROUP
mov ds,ax
lea bx,signature
mov ax,0EE02h
int 2Fh
After this sequence, the TSR's entry point will be in the DX:AX register
pair.
The TSR accepts far calls to its entry point. All parameters are passed on
the stack. All pointers are far pointers. The Pascal calling convention is
used. This is an example (in C) of a function prototype for the TSR entry
point:
long (pascal __far *FSSPEAK_ENTRYPOINT)(int command, unsigned int parm2,
unsigned char __far *parm3, long parm4);
The following is a list of the four types of calls that can be made, and
the content of each of the four parameters in each type of call:
Set Sample Rate:
Parameter 1: 0001h
Parameter 2: number of samples per second
Parameter 3: NULL
Parameter 4: 0
Return value: 0 if success, 1 if invalid parameter,
-1 if feature not available
Set Compression Algorithm
Parameter 1: 0002h
Parameter 2: 0 = no compression
1 = 4-bit uLaw
2 = 2-bit ADPCM
Parameter 3: NULL
Parameter 4: 0
Return value: 0 if success, 1 if invalid parameter,
-1 if feature not available
Set Destination and Calibrate
Parameter 1: 0003h
Parameter 2: 0FFFFh = auto-locate EAR-WIZARD; use internal
speaker if not found
0 = internal speaker
1 = LPT1
2 = LPT2
3 = LPT3
Parameter 3: NULL
Parameter 4: (meaningful only when using auto-locate)
0 = use internal speaker if no EAR-WIZARD
1 = return error if no EAR-WIZARD
Return value: 0 if success, 1 if invalid parameter,
2 if CPU too slow,
3 if under Windows without TrapTimerPorts=False,
4 if EAR-WIZARD not found (see Parameter 4)
Play Sound from Memory Buffer
Parameter 1: 0004h
Parameter 2: 0
Parameter 3: far pointer to data buffer
Parameter 4: 32-bit integer; count of bytes in buffer
Return value: Number of bytes played
The "Set Sample Rate" and "Set Compression Algorithm" calls will always
return -1 in the Standard Version.
IMPORTANT: For the Professional Version: if the sample rate is changed
and/or the compression algorithm is changed, then the "Set Destination and
Calibrate" call must be made again before any further "Play Sound" calls.
Notice that the TSR does not perform memory allocation or file I/O
operations. When using the module in this mode, your application is
responsible for reading the desired sound data into memory.
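The required order of calls (set the sample rate, set the compression
algorithm, then calibrate) can be sketched as follows. This is an
illustration only: fsspeak_entry_t is a portable stand-in for the real
"pascal __far" entry point type, fsspeak_prepare and fake_entry are
hypothetical names, and fake_entry merely imitates the Standard Version so
that the sequence can be exercised on a machine without the TSR.

```c
#include <stddef.h>

/* Portable stand-in for the TSR entry point. On a 16-bit DOS compiler
 * the real type is a pascal __far function pointer. */
typedef long (*fsspeak_entry_t)(int command, unsigned int parm2,
                                unsigned char *parm3, long parm4);

enum { FS_SET_RATE = 1, FS_SET_COMPRESSION = 2, FS_CALIBRATE = 3, FS_PLAY = 4 };

/* Prepare the TSR for playback.
 * Returns 0 when ready to play, -1 if the sample rate was rejected,
 * -2 if the compression algorithm was rejected, or the non-zero
 * calibration result (1..4) as listed above. */
int fsspeak_prepare(fsspeak_entry_t entry, unsigned rate, unsigned compression,
                    unsigned dest, long require_ear_wizard)
{
    long rc;

    rc = entry(FS_SET_RATE, rate, NULL, 0L);
    if (rc == 1)
        return -1;
    /* rc == -1 means "feature not available" (Standard Version); the
     * data must then be uncompressed 16572 Hz. We carry on regardless. */

    rc = entry(FS_SET_COMPRESSION, compression, NULL, 0L);
    if (rc == 1)
        return -2;

    /* Calibration must follow any change of rate or compression. */
    rc = entry(FS_CALIBRATE, dest, NULL, require_ear_wizard);
    return (int)rc;
}

/* Host-side stand-in that records the call order (NOT the real TSR). */
int fake_calls[8];
int fake_count = 0;

long fake_entry(int command, unsigned int parm2, unsigned char *parm3,
                long parm4)
{
    (void)parm2;
    (void)parm3;
    if (fake_count < 8)
        fake_calls[fake_count++] = command;
    if (command == FS_SET_RATE || command == FS_SET_COMPRESSION)
        return -1L;            /* imitate the Standard Version */
    if (command == FS_PLAY)
        return parm4;          /* pretend the whole buffer was played */
    return 0L;                 /* calibration succeeded */
}
```

With the real TSR, the entry argument would be the far pointer obtained
through the /L option or int 2Fh function 02h, and dest would be 0FFFFh to
request auto-locate.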
XI. The Example Programs
--------------------
EXAMPLE1.C and EXAMPLE1.EXE:
This is a working commented example, written in C, showing how to use the
FSSPEAK.EXE module as a transient program, loading and executing it each
time sound playback is required. EXAMPLE1 accepts an optional command line
parameter indicating which LPT port is connected to the EAR-WIZARD. If no
parameters are specified on the command line, then the program uses the
auto-locate feature to determine the destination. A command-line parameter
of "SPKR" forces the use of the internal PC speaker. Examine the C file for
details.
EXAMPLE2.C and EXAMPLE2.EXE:
This example appears externally to perform the same function as EXAMPLE1;
however, EXAMPLE2 uses the FSSPEAK.EXE module as a TSR (memory-resident)
program. EXAMPLE2 itself actually performs the load and unload operation
on FSSPEAK.EXE. Another option is to load and unload in a batch file,
allowing your executable to get by with only calls to the resident entry
point (obtained with Int 2Fh function 0EE02h).
RUN_ME.BAT:
This is an example of the use of FSSPEAK.EXE as a transient program in
the simplest possible mode: commands in a batch file.
XII. Other Programs Included in the Toolkit
--------------------------------------
ULAW.EXE:
This program reduces the size of a voice data file by a 2:1 ratio by
encoding each 8-bit sample as a 4-bit number such that the quantization
error is minimized near the equilibrium (silence) point (80 hex) and
increases with increasing excursion. This is a common practice in telephone
systems used to reduce data throughput requirements. It causes surprisingly
little degradation of the sound quality, although there is some. The effect
of quantizing with this program has been described as a slightly "gritty"
quality in the reproduced speech.
To use ULAW, type:
ULAW <source filename> <destination filename>
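The general idea of 4-bit companding around the silence point can be
sketched as follows. This is NOT the encoding used by ULAW.EXE, whose file
format is not described in this manual; it is a simplified illustration
that uses one sign bit and a 3-bit segment number, with segment widths
doubling as the sample moves away from 80 hex. All function names are
hypothetical.

```c
#include <stdint.h>

/* Encode one 8-bit sample (silence = 0x80) as a 4-bit code:
 * bit 3 = sign, bits 0..2 = segment. Segment 0 covers magnitudes 0..1,
 * segment k covers 2^k .. 2^(k+1)-1, so steps are finest near silence. */
uint8_t compand4_encode(uint8_t sample)
{
    int d = (int)sample - 0x80;
    int sign = d < 0;
    int mag = sign ? -d : d;               /* 0..128 */
    int seg = 0;
    while (seg < 7 && mag >= (1 << (seg + 1)))
        seg++;
    return (uint8_t)((sign << 3) | seg);
}

/* Decode a 4-bit code to a representative 8-bit sample, taken from
 * roughly the middle of the segment and clamped to 0..255. */
uint8_t compand4_decode(uint8_t code)
{
    int seg = code & 7;
    int mag = seg ? (3 << (seg - 1)) : 0;
    int v = (code & 8) ? 0x80 - mag : 0x80 + mag;
    if (v < 0)
        v = 0;
    if (v > 255)
        v = 255;
    return (uint8_t)v;
}

/* Pack two samples into one byte (first sample in the high nibble),
 * which is where the 2:1 size reduction comes from. */
uint8_t compand4_pack(uint8_t first, uint8_t second)
{
    return (uint8_t)((compand4_encode(first) << 4) | compand4_encode(second));
}
```

In this sketch a sample one step from silence decodes with an error of 1,
while a full-scale sample of FF hex decodes as E0 hex, an error of 31. The
error grows with excursion, as described above.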
UNULAW.EXE:
Decompresses files compressed with the ULAW.EXE program. Note that this
does not recover the full sound fidelity of the original file; this, like
almost every audio compression technique, is "lossy". Some of the original
information is lost in the compression process.
To use UNULAW, type:
UNULAW <source filename> <destination filename>
ADPCM.EXE:
This program reduces the size of a voice data file by a 4:1 ratio using
ADPCM (Adaptive Differential Pulse Code Modulation). As with most compression
schemes, some distortion is produced, but intelligibility is not seriously
reduced. If you need to package large amounts of recorded sound, and are
willing to accept a small reduction in sound quality, this is the method
to use.
To use ADPCM, type:
ADPCM <source filename> <destination filename>
UNADPCM.EXE:
Decompresses files compressed with the ADPCM.EXE program. Note that this
does not recover the full sound fidelity of the original file; this, like
almost every audio compression technique, is "lossy". Some of the original
information is lost in the compression process.
To use UNADPCM, type:
UNADPCM <source filename> <destination filename>
RESAMPLE.EXE: (Professional version only)
If you need to import PCM files from other sources (8-bit only please),
or simply wish to alter the sample rate of a data file that has been
recorded with VDFE, this utility can accomplish that task. Reducing the
sample rate of an existing recording reduces its size at the expense of
spectral bandwidth. Increasing the sample rate of an existing recording
does not change its sound, and is normally done only in order to match
some other standard. LOWPASS.EXE must be in the current directory when
RESAMPLE is executed.
RESAMPLE uses either 4 or 6 command-line parameters. They are:
* the original (source) file name
* the destination (new sample rate) file name
* the sample rate of the original file
* the desired sample rate of the new file
* the name of the original VDFE index file
* the name of the new index file
The last two parameters are optional. They are not needed if you are
not using index files. See the VDFE documentation (help file) for more
information on index files.
Here is the syntax definition for RESAMPLE.EXE:
RESAMPLE <src file> <dest file> <src rate> <dest rate>
[<src index file> <dest index file>]
MAKEWAVE.EXE: (Professional version only)
This program is similar to RESAMPLE, except that (1) a standard WAV header
is added to the destination file, and (2) index files are not supported.
LOWPASS.EXE must be in the current directory when MAKEWAVE is executed.
To use MAKEWAVE, type:
MAKEWAVE <src file> <dest file> [<src rate> [<dest rate>]]
ADJUST.EXE:
The purpose of this program is to facilitate the adjustment and testing of
digitizer boards. It has no command line parameters; simply type ADJUST to
start the program. There are three items of interest on the display:
(1) The COM port to which the digitizer is presumably attached while this
program runs. Cycle through the port numbers by pressing <F2> until the
correct number appears.
(2) A DC offset meter which indicates the value of the average reading
from the A/D converter relative to the "ideal" center point of 80 hex.
This reading should be adjusted to zero using R24.
(3) A signal level meter. This shows the instantaneous peak-to-peak
level being received by the A/D converter. It should be as low as you
can get it (while not speaking, of course). A reading of 10 to 20 or
less is considered reasonable. If the reading is higher, then possible
trouble sources might be: (a) high levels of room background noise from
things such as fans and air conditioners, (b) powering the digitizer with
an electrically noisy power supply (it's designed for batteries; don't
bypass this!), (c) input gain control R8 set too high, or (d) the
digitizer may be sitting too near a source of radiated electrical noise,
such as a switching power supply or fluorescent lamp.
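The two meter readings can also be approximated from a buffer of recorded
8-bit PCM samples. The following is a minimal sketch and not the code used
by ADJUST.EXE: the function name is hypothetical, and it assumes both
readings are expressed in raw A/D counts.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute two figures from a block of unsigned 8-bit PCM samples:
 *   *dc_offset    = average sample value minus the ideal center (0x80)
 *   *peak_to_peak = largest sample minus smallest sample
 * Both results are 0 for an empty buffer. */
void pcm_levels(const uint8_t *buf, size_t n, int *dc_offset, int *peak_to_peak)
{
    long sum = 0;
    uint8_t lo = 255, hi = 0;
    size_t i;

    if (n == 0) {
        *dc_offset = 0;
        *peak_to_peak = 0;
        return;
    }
    for (i = 0; i < n; i++) {
        sum += buf[i];
        if (buf[i] < lo)
            lo = buf[i];
        if (buf[i] > hi)
            hi = buf[i];
    }
    *dc_offset = (int)(sum / (long)n) - 0x80;
    *peak_to_peak = hi - lo;
}
```

Run over a recording of "silence", a dc_offset near zero and a small
peak_to_peak value would correspond to a well-adjusted, quiet setup.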
XIII. How to Reduce Disk Space Requirements
-------------------------------------
For the ultimate level of data compression, to minimize disk space as
much as can reasonably be done without severe degradation, first RESAMPLE
a VDFE output file from 16572 to 11025 Hz (a standard rate), then compress
with ADPCM.
Assuming the original data file to be named STUFF.VOI, this would be the
command sequence:
RESAMPLE stuff.voi stuff.low 16572 11025
ADPCM stuff.low stuff.adp
The file STUFF.ADP is shipped with the application; STUFF.LOW is an
intermediate result and can be deleted.
Data files produced by VDFE are sampled at a rate of 16572 Hz. In this form,
recorded sound occupies about 971 k bytes per minute. By reducing the sample
rate to 11025 Hz, the requirement goes down to 646 k bytes per minute.
Subsequently compressing with ADPCM brings this to a low figure of 162 k
bytes per minute. In this format, a high-density 3.5 inch diskette (1.44
megabytes) can hold about 8.5 minutes of recorded sound.
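The per-minute figures above follow directly from the sample rate and the
number of bits stored per sample. A minimal sketch of the arithmetic (the
function name is hypothetical):

```c
/* Bytes needed to store one minute of sound.
 * bits_per_sample is 8 for uncompressed PCM, 4 for uLaw (2:1) and
 * 2 for ADPCM (4:1), as described in this manual. */
unsigned long pcm_bytes_per_minute(unsigned long sample_rate,
                                   unsigned bits_per_sample)
{
    return sample_rate * 60UL * bits_per_sample / 8UL;
}
```

For 16572 Hz uncompressed this gives 994320 bytes (about 971 k), for
11025 Hz uncompressed 661500 bytes (about 646 k), and for 11025 Hz with
ADPCM 165375 bytes (about 161.5 k, quoted above as 162 k), where 1 k is
1024 bytes.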
XIV. The EAR-WIZARD Playback Device
------------------------------
The EAR-WIZARD is an audio playback device which converts a parallel printer
(LPT) port into a sound output port. For optimum and consistent sound
reproduction on a variety of PCs including laptops, you can (optionally)
sell an EAR-WIZARD with each copy of your application. See the "Pricing"
section at the end of this file for quantity discounts on EAR-WIZARD.
As an alternative, you might encourage end users to contact Farpoint
Software directly if they would like to buy an EAR-WIZARD instead of using
the internal speaker of the PC.
The EAR-WIZARD has dimensions of 0.65" by 1.90" by 2.17". It is in fact built
into a standard DB-25 plastic "connector hood". There is a male DB-25
connector that plugs into any parallel printer port, and a 3.5mm stereo-type
phono jack which can drive either a set of headphones or an external speaker.
A volume-control knob protrudes from one "edge" of the connector hood, giving
the user an amplitude control range from zero up to a drive level sufficient
to produce a reasonable volume level from an unamplified 8-ohm speaker. No
external power source is required; all power to operate the device is taken
from the printer port itself.
XV. Playback Through the PC Speaker
-------------------------------
The speaker on the PC and its associated driver circuitry are quite simple and
crude, having been designed primarily for creating single square-wave tones
of various audio frequencies. This speaker is typically driven by a pair of
transistors used as a current amplifier which is in turn driven directly by
the output of a TTL gate. This results in only two possibilities of voltage
across the voice coil: 0 volts and 5 volts. Any sound to be reproduced by
this system must be reduced to an approximation in the form of a stream of
constant-amplitude rectangular pulses of varying duration and frequency.
In the playback module in this package, instantaneous DC drive voltages to
the speaker are simulated by providing an alternating series of high and low
output pulses such that the ratio of high pulses to low pulses averages out
to the desired voltage. The lowpass filtering required is provided by the
human ear and the mechanical limits of the speaker.
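One simple way to produce a pulse stream whose high-to-low ratio averages
out to a desired level is an accumulator that emits a high pulse each time
it overflows. The sketch below illustrates that principle only; it is an
assumption for explanation and not FSSPEAK's actual method, which is not
documented here. The function name is hypothetical, and real playback
would also have to time each pulse against a hardware timer.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill out[0..n-1] with 0/1 pulses whose density approximates
 * level/256, and return the number of high pulses produced. */
size_t pulse_density(uint8_t level, uint8_t *out, size_t n)
{
    unsigned acc = 0;
    size_t i, highs = 0;

    for (i = 0; i < n; i++) {
        acc += level;
        if (acc >= 256) {      /* overflow: emit a high pulse */
            acc -= 256;
            out[i] = 1;
            highs++;
        } else {
            out[i] = 0;
        }
    }
    return highs;
}
```

A level of 128 (half scale) over 16 pulses yields 8 highs alternating with
8 lows; a level of 64 yields 4 highs in 16.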
XVI. Speed Requirements for the Target Machine
-----------------------------------------
A relatively high timing resolution is required to accomplish the playback
task. This means that there is a minimum CPU speed requirement for proper
functioning of FSSPEAK.EXE. Many factors affect the apparent speed of the
machine, including clock speed, memory access timing, cache memory, etc.
In general, the minimum speed required must at least equal that of a 10 MHz
286, so your programs should be targeted at systems with at least this level
of performance.
IMPORTANT: ALWAYS check the return value from any calibration call to
FSSPEAK. The CPU speed requirements vary depending on sample rate and
compression algorithm. It is important to know that, in order to avoid
excessive "sampling squeal", FSSPEAK actually must play each sample twice
when rates below 15000 Hz are specified. Also, the CPU speed required
increases with increasing levels of compression. A "worst case" scenario
is ADPCM compression at a rate of 14999 Hz; this requires at least a
386/33! If you know your code must work on slower CPUs, we recommend that
you stick with the default sample rate of 16572 Hz and not use playback
decompression. It is still possible to ship compressed files; your users
will need either UNULAW.EXE or UNADPCM.EXE to decompress the data files on
disk.
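The effect of the sample-doubling rule on the time available per output
value can be sketched as follows. This is only an arithmetic model of the
rule stated above; it says nothing about the actual instruction timing
inside FSSPEAK.EXE, and the function names are invented for the
illustration.

```python
DOUBLING_THRESHOLD_HZ = 15000   # rates below this are played twice per sample
DEFAULT_RATE_HZ = 16572         # the toolkit's default sample rate


def output_rate(sample_rate_hz):
    """Output values per second that the playback loop must produce."""
    if sample_rate_hz < DOUBLING_THRESHOLD_HZ:
        return 2 * sample_rate_hz
    return sample_rate_hz


def microseconds_per_output(sample_rate_hz):
    """Time budget for producing one output value, in microseconds."""
    return 1_000_000 / output_rate(sample_rate_hz)


if __name__ == "__main__":
    for rate in (14999, DEFAULT_RATE_HZ):
        print(rate, output_rate(rate), microseconds_per_output(rate))
```

A rate of 14999 Hz therefore demands nearly twice as many output values per
second as the default rate, which is why it is the "worst case" when
combined with decompression.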
XVII. General Information About the Digitizer (Recorder)
--------------------------------------------------
The digitizer is a classic 8-bit successive approximation analog-to-digital
converter, combined with a microphone amplifier, anti-aliasing filter, level
indicators, sample-and-hold amplifier, and a slightly odd 4-bit wide parallel
interface to the PC serial port. Other features include a circuit which
produces a small degree of amplitude compression to improve perceived
loudness, an input gain control with a 20 decibel range (a 100:1 power
ratio, or 10:1 in voltage), a DC offset trim, and a remote "stop" switch.
The data read from the digitizer is stored
as a simple list of 8-bit numbers, each indicating the voltage at the time of
the sample. This storage scheme is commonly called PCM or Pulse Code
Modulation.
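Because PCM data of this kind uses one byte per sample, storage and playing
time are related by simple arithmetic. The sketch below assumes raw data
with no file header (the Voice Data File format itself is not described
here) and uses the default rate of 16572 Hz; the function names are
invented for the illustration.

```python
def recording_seconds(pcm_bytes, sample_rate_hz=16572):
    """Playing time of raw 8-bit PCM data: one byte is one sample."""
    return len(pcm_bytes) / sample_rate_hz


def bytes_needed(seconds, sample_rate_hz=16572):
    """Uncompressed storage needed for a recording of the given length."""
    return round(seconds * sample_rate_hz)


if __name__ == "__main__":
    one_second = bytes(16572)           # 16572 zero-valued samples
    print(recording_seconds(one_second))
    print(bytes_needed(10))             # bytes for ten seconds of speech
```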
The interface to the digitizer depends entirely on direct I/O bit manipulation
of the PC serial port, and does NOT use the receiver serial interrupt. It
should therefore work equally well connected to any PC serial port, from
COM1 through COM4. The interconnecting cable must at least connect pins
3, 4, 5, 6, 7, 8, 20, and 22. When in doubt, use a full 25-wire "straight
through" cable; a 2-3-7 or "three-wire" cable will not work. Do not use a
null modem cable.
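The idea of a 4-bit wide interface can be sketched as follows. On the
standard PC serial port (8250-type UART), the four input lines CTS, DSR, RI,
and DCD (DB-25 pins 5, 6, 22, and 8) appear as the upper four bits of the
modem status register. How the digitizer actually assigns data bits to
these lines, the order in which the two halves of a sample arrive, and the
signal polarity are NOT documented here; everything below is an assumption
made for the illustration.

```python
# Upper four bits of the 8250 modem status register (current line states).
MSR_CTS = 0x10   # DB-25 pin 5
MSR_DSR = 0x20   # DB-25 pin 6
MSR_RI = 0x40    # DB-25 pin 22
MSR_DCD = 0x80   # DB-25 pin 8


def nibble_from_msr(msr):
    """Extract the four status-line bits as a value 0 .. 15."""
    return (msr >> 4) & 0x0F


def sample_from_two_reads(msr_high, msr_low):
    """Assemble one 8-bit sample from two status reads.

    Assumes (hypothetically) that the high half is read first and that
    the lines are not inverted.
    """
    return (nibble_from_msr(msr_high) << 4) | nibble_from_msr(msr_low)


if __name__ == "__main__":
    print(hex(sample_from_two_reads(0xA0, 0x5F)))
```

The lower four bits of the register are masked off because they report
changes in the lines rather than the line states themselves.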
The circuit is designed to use a "dynamic" or moving magnetic coil type of
microphone. Ceramic or piezoelectric (crystal) microphones can also be made
to work by reducing the setting of the input gain control, but these types
are often inferior to dynamic microphones. Electret and capacitor microphones
won't work unless they provide their own DC biasing. The type of handheld
microphone that often comes with cheap portable cassette tape recorders is a
good bet.
The digitizer board is designed to operate from two conventional 9-volt
batteries.
---------- DVPT version 3.0 Instruction Manual ---------- page 15 ---------
XVIII. The Signal Level Indicators on the Digitizer
--------------------------------------------
The LED labelled D8 (green) is the -6 decibel threshold indicator; it lights
when the signal exceeds one half of the maximum range of the A/D converter.
The LED labelled D7 (red) is the clipping indicator; it lights when the
signal level equals or exceeds the maximum range of the A/D converter. The
lighting of the clipping indicator means that the peaks of the reproduced
waveform will be clipped or "flat-topped". A small amount of clipping on
speech waveforms generally does not degrade intelligibility, although it
can be psychologically irritating if it occurs over a significant portion
of the data.
The level indicators are designed to be used in the following way: When
speaking into the microphone or making a recording from another source, the
input level (gain) control R8 should be adjusted so that the -6 dB indicator
remains lit a large percentage of the time, whereas the clipping indicator
only flickers on the loudest peaks of the sound. This is the optimum signal
level. It makes the maximum possible use of the resolution of the A/D
converter without introducing an undue amount of clipping distortion.
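The same two thresholds can be checked in software on data which has
already been recorded. The sketch below is not part of the toolkit. It
assumes unsigned 8-bit samples centered on 128, which is not stated in this
manual, and the function name is invented for the illustration.

```python
def level_indicator_stats(samples, midpoint=128, full_scale=127):
    """Return (fraction above half scale, fraction at clipping).

    Mirrors the two LEDs: the -6 decibel threshold is one half of the
    converter range, and clipping is at or beyond the full range.
    """
    if not samples:
        return 0.0, 0.0
    over_half = 0
    clipped = 0
    for s in samples:
        amplitude = abs(s - midpoint)
        if amplitude > full_scale / 2:
            over_half += 1
        if amplitude >= full_scale:
            clipped += 1
    n = len(samples)
    return over_half / n, clipped / n


if __name__ == "__main__":
    print(level_indicator_stats([128, 200, 255, 0]))
```

By the guideline above, a well-set recording would show a large first
figure and a second figure close to zero.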
XIX. Using the EAR-WIZARD as a Security Device
-----------------------------------------
If you set up your application so that it will not use the internal speaker,
then the EAR-WIZARD auto-locate feature can be used as a crude "hardware
security key". Your application should simply refuse to run if the calibrate
pass (with auto-locate enabled) returns an indication that there is no
EAR-WIZARD. This method is not as secure as a "real" dongle, but will thwart
many people's efforts at pirating your software. If you are producing a
shareware package, you might allow the unregistered version to use the
internal speaker (but identify itself as unregistered), while the registered
version uses only the EAR-WIZARD.
If a sufficient number of DVPT users indicate an interest in the use of
the EAR-WIZARD as a security device, we plan to produce a version which
incorporates a somewhat more serious (hard to defeat) mechanism which can
be customized for each registered DVPT user.
---------- DVPT version 3.0 Instruction Manual ---------- page 16 ---------
XX. Pricing
-------
Software registration for personal use or use within a single
organization (redistribution not permitted):
   Standard Version       $50  (free with purchase of 10 or more EAR-WIZARDs)
   Professional Version   $70  (free with purchase of 20 or more EAR-WIZARDs)
- - - - - - - - - - - - - - - - - - - - - - - - - -
Software registration and license to redistribute FSSPEAK.EXE:
   Standard Version       $75  (free with purchase of 20 or more EAR-WIZARDs)
   Professional Version   $95  (free with purchase of 30 or more EAR-WIZARDs)
- - - - - - - - - - - - - - - - - - - - - - - - - -
EAR-WIZARD printer port playback device:
   Quantity  1 -  9       $25 each
   Quantity 10 - 49       $22 each
   Quantity 50 up         $18 each
- - - - - - - - - - - - - - - - - - - - - - - - - -
Fully complete and tested digitizer (sound recorder) board with 30 day
warranty (does not include enclosure, batteries, serial cable, or
microphone):
   For unlicensed users                   $79
   With FSSPEAK redistribution license    $59
- - - - - - - - - - - - - - - - - - - - - - - - - -
Incidental hardware: Although we can supply the following items, you
can obtain them through your local consumer electronics stores and
computer stores.
   Headphones for use with EAR-WIZARD     $11
   500 ohm Dynamic Microphone             $19
   DB-25 cable                            $20
   DB-9 to DB-25 adapter                  $10
- - - - - - - - - - - - - - - - - - - - - - - - - -
NOTICE: Since our manufacturing costs may increase as parts costs increase
over time, the prices quoted for hardware items are valid only
through December 1994.
---------- DVPT version 3.0 Instruction Manual ---------- page 17 ---------
Please address all correspondence to:
Farpoint Software
2501 Afton Court
League City, Texas 77573-3438
U.S.A.
Telephone numbers, etc.:
Voice: 713-332-3782
Fax: 713-332-4730
Compuserve ID: 74030,554
If at all possible, use the form created by printing the file ORDER.FRM.
If you choose to write in the information by hand, please print clearly.
Please make all checks and/or money orders payable to Farpoint Software.